Abstract:

We expand on a recent development by Hardy, in which quantum mechan- 
ics is derived from classical probability theory supplemented by a single new 
axiom, Hardy's Axiom 5. Our scenario involves a 'pretend world' with a 'pre- 
tend' Heisenberg who seeks to construct a dynamical theory of probabilities 
and is led - seemingly inevitably - to the Principles of Quantum Mechanics.

In two recent papers [1,2], Hardy shows how classical probability theory
morphs directly into quantum theory, complete with full instructions for mea- 
surement and interpretation. Hardy's demonstration employs five axioms. 
The first four establish the basic classical probability theory in the vocabulary 
of a generalized Stern-Gerlach apparatus, which is the paradigm of quantum 
mechanics. Hardy introduces systems of $N$ states, each of dimension $K$
($K = N$ classically, $K = N^2$ quantum mechanically); with subsystems
$M < N$; and composite systems with $N = N_A N_B$ and dimension
$K = K_A K_B$; and finally, Hardy introduces his crucial fifth axiom.

Hardy's Axiom 5 requires that there exist a continuous reversible transfor- 
mation between any two pure states. Hardy emphasizes the marvelous purity 
of his derivation encompassed in the fact that the key word continuous is the 
sole genetic marker responsible for the profound distinction between classical 
probability theory and quantum mechanics. 

Our purpose here - following Hardy and essential earlier contributions
by Caticha [3] - is also to start with classical probability theory but to
'derive' quantum mechanics in a less formal, austere and formidable way than
Hardy. We hope to 'discover' in a 'pretend world' a pathway equivalent to
that of Hardy, which might appeal to the more heuristic and intuitive tastes 
of all those primarily interested in the physical as distinct from the more 
mathematical aspects of quantum mechanics. 

Of course, quantum mechanics must survive all these shenanigans com- 
pletely unscathed. What these efforts do hope to provide is not new physics, 
but perhaps a raison d'etre, an 'understanding', of quantum mechanics - an 
'interpretation' of quantum mechanics - in line with the overwhelming, but 
not yet and probably never unanimous, consensus expressed by Fuchs and 
Peres [4] and many others, and still passionately debated [5]: specifically, 
that quantum mechanics is a theory of information propagation. To say that 
quantum mechanics is 'a' theory is to seriously understate the case. Quan- 
tum mechanics will be seen to be 'the' fundamental theory of information 
propagation. The wave function is found to be the necessarily subjective 
encoding of information - and when information changes, the wave function 
must be changed accordingly. This interpretation puts an end at last to all 
quantum paradoxes. They are now seen to be the result of a too literal - 
even naive - faith in our introduction to the subject via wave mechanical 
extensions of the classical objective world view. 

2 Quantum Mechanics from Information Theory 

Here we present a 'derivation' of quantum mechanics starting from infor- 
mation theory. We find that it is surprisingly straightforward to reverse the 
roles of quantum mechanics and information. The logic sequence is no longer 

a) the discovery of the Heisenberg matrix mechanics underlying classical me- 
chanics; and then of 

b) Schroedinger wave mechanics; followed by 

c) the verification of quantum mechanics in many stationary state situations; 
but 

d) still plagued by paradoxical time-dependent situations; necessitating

e) a host of ad hoc 'interpretations' of quantum mechanics to accommodate 
all paradoxes; and finally 

f) resolution of all confusion by recognition of the role of the wave-function 
of quantum mechanics as the fundamental encoding and propagation of sub- 
jective information. 

In fact the logic sequence can usefully be completely reversed. We start,
of course, with the essential advantage of knowing the desired results, and of 
having the necessary language and mathematical techniques already familiar 
from almost a century of development of quantum mechanics. We are then 
able to 'derive' quantum mechanics from the requirement that the informa- 
tion entropy of an isolated system should remain unchanged as it evolves in 
time. We imagine a 'pretend world' in which we 'discover' quantum mechan- 
ics embedded in classical information probability. 

We must be directly driven in this 'pretend' world by a fundamental pos- 
itivist philosophy to invent whatever tools are necessary to keep moving and 
- hopefully - to keep making progress. These tools - which are dramatically 
suggested, even demanded, by the formalism - include unitary operators, 
hermitian generators, commutator brackets; all operating in a Hilbert space; 
then canonical commutation relations of q's and their newly introduced p's;
then Heisenberg; then Schroedinger; then classical mechanics and eventually 
even the Lagrangian and the Principle of Least Action! Information theory 
explains the previously inexplicable: Why a Principle of Least Action? 

The first step is to introduce the entropy of a macro-ensemble 

$$S = -\mathrm{tr}\,\{P \ln P\}$$

where P is the probability of a micro-ensemble. None of these quantities 
will be fully specified but rather they will be explored. S is chosen to have 
the appropriate limiting values: $S \to 0$ if $P \to 1$ for the simplest
situation of a 'pure state'. Otherwise $S > 0$, and $S \to \ln N$ for a
completely random mixture of $N$ such simplest situations each with
probability $P = 1/N$.
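
The limiting values quoted above can be checked with a minimal sketch. It assumes that $P$ is already diagonal, so that the trace reduces to a sum over the probabilities $P_j$; the function name `shannon_entropy` and the choice $N = 4$ are illustrative and do not come from the text.

```python
import math

def shannon_entropy(probs):
    """Return S = -sum_j P_j ln P_j, using the convention 0 ln 0 = 0."""
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0.0)

N = 4                             # illustrative number of micro-ensembles
pure = [1.0] + [0.0] * (N - 1)    # a single micro-ensemble with P = 1
uniform = [1.0 / N] * N           # completely random mixture, P = 1/N each
partial = [0.7, 0.3, 0.0, 0.0]    # an intermediate case

S_pure = shannon_entropy(pure)
S_uniform = shannon_entropy(uniform)
S_partial = shannon_entropy(partial)

print("pure state:     S =", S_pure)
print("uniform mix:    S =", S_uniform, " ln N =", math.log(N))
print("partial mixing: S =", S_partial)
```

The pure state gives zero entropy, the uniform mixture reproduces $\ln N$, and any other distribution falls in between.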

The 'trace' is a dimensional reduction by summing over all internal coordi- 
nates labeling the micro-ensembles which contribute to the macro-ensemble. 
We are obviously cribbing the fundamental role of the density matrix and 
the implicit role of macro- and micro-ensembles. 

The next step is the stuff of legend: We imagine a brilliant 24-year-old on
a solitary vacation on a rain-swept rocky shore, daydreaming about the dy- 
namical problem of calculating the probabilities P rather than just assigning 
them, as we progress from thermodynamics to the next more fundamental 
theoretical level of statistical mechanics and ultimately to a fully dynamical 
theory. Perhaps from analogy with Maxwell's dynamical equations for field 
amplitudes rather than for positive field energy densities, our young genius
decides to: 

a) factorize the probability $P$ into the suggestive form

$$P \to \Psi^2 = \Psi \times \Psi.$$

This form has the virtue that $P \geq 0$ for arbitrary real $\Psi$.
Factorization leads to dynamical quantities satisfying the anticipated
requirements of variational calculus where the virtual variations in $\Psi$
- $\delta\Psi$ - are unconstrained by the requirement that $P \geq 0$. Here
we have to borrow heavily from classical mechanics and the requirements on
classical dynamical variables that
they be continuous and differentiable. This motivates us to search for an 
analytic dynamical theory for the probabilities P, but that is impossible. 
The normalization of the probabilities - trP = 1 - is a holonomic constraint 
(i.e., an equality constraining the candidate generalized coordinates P) in 
Goldstein's classification [6], and is manageable in the context of classical 
mechanics simply by eliminating one P. However, the positivity condition 
on the probabilities - $P \geq 0$ - as an inequality is a nonholonomic
constraint which "there is no general way of attacking".

A factorizable probability is the especially simple case of a pure state but 
it is an immensely instructive warm-up exercise. Proceeding, 

b) we require $\Psi$ to be more than just $\sqrt{P}$. We require $\Psi$ to be
analytic and differentiable in its variables, and therefore to have the
possibility of sign-changes where $P = 0$.

c) The invariance of $P$ and $S$ under a simple sign-change of $\Psi$ is
further extended by allowing $\Psi$ to be complex so $P \to \Psi\Psi^*$. Why?
Why complex numbers? Our basic response is: Why not? Our only requirement
was that $P \geq 0$, so to choose $\Psi$ real was to overconstrain it. With
complex $\Psi$

d) the full invariance group of $P$ and $S$ is the group of all unitary
transformations $U$ on $\Psi$ with $U^\dagger U = 1$

$$\Psi \to \Psi' = U\Psi,$$

and the merit of allowing $\Psi$ to be complex becomes evident.

Without complex $\Psi$ the invariance group of $P$ and $S$ is just the
trivial change of sign $\Psi \to -\Psi$, and our search for a dynamical
theory of $P$ comes to a grinding halt. With complex $\Psi$ we are on
familiar ground. $\Psi$ is the state function for the macro-ensemble of the
density matrix, here reduced
(trivialized) to the simplest pure-state micro-ensemble. We will relax these 
restrictions soon. 
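
The invariance described in steps c) and d) can be illustrated with a minimal sketch for a two-component complex $\Psi$. The two-parameter family of unitary matrices in `unitary` and the sample state are hypothetical choices made only for illustration. The sketch checks that the total probability $\sum_i \Psi_i \Psi_i^*$ is unchanged by $\Psi \to U\Psi$, even though the individual components are mixed.

```python
import cmath
import math

def apply(U, psi):
    """Multiply a 2x2 matrix U (nested lists) into a 2-component vector psi."""
    return [U[0][0] * psi[0] + U[0][1] * psi[1],
            U[1][0] * psi[0] + U[1][1] * psi[1]]

def total_probability(psi):
    """Sum of Psi_i Psi_i^* over components, i.e. tr P for the pure state."""
    return sum((c * c.conjugate()).real for c in psi)

def unitary(theta, phi):
    """A hypothetical two-parameter family of 2x2 unitary matrices."""
    c, s = math.cos(theta), math.sin(theta)
    e = cmath.exp(1j * phi)
    return [[c, -s * e.conjugate()],
            [s * e, c]]

psi = [0.6 + 0j, 0.8j]                    # normalized: 0.36 + 0.64 = 1
psi_flipped = [-c for c in psi]           # the real-case invariance: Psi -> -Psi
psi_rotated = apply(unitary(0.7, 1.3), psi)

print("before:       ", total_probability(psi))
print("sign flip:    ", total_probability(psi_flipped))
print("unitary map:  ", total_probability(psi_rotated))
print("components:   ", [abs(c) ** 2 for c in psi_rotated])
```

The component probabilities change under the unitary map while their sum does not, which is the sense in which the complex extension enlarges the invariance group beyond a mere change of sign.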

e) For the system to evolve smoothly in time without change of information
entropy, we require the time evolution of the state function $\Psi$ to be a
unitary transformation

$$|\Psi(t=0)\rangle \to |\Psi(t)\rangle = U(t)|\Psi(0)\rangle = e^{-iHt}|\Psi(0)\rangle$$

(going over to the familiar Dirac notation). The invariance of the entropy is 
made clear in this way for our simplest system of a single micro-ensemble, and 
is maintained for a mixture of noninteracting micro-ensembles. Caticha [3]
enforces this requirement by requiring the Hilbert space as the "uniquely
natural" choice. Caticha formalizes this development, but we proceed with 
our 'pretend' discovery scenario. 

Here we have introduced the hermitian generator of infinitesimal time 
translations $H = H^\dagger$. This is by definition the Hamiltonian, but we have not
imposed anything except hermiticity in the new information based dynamics. 
The familiar Hamiltonian dynamics will be a product of the information 
theory. We have also kept to the simplest possible case by taking H to be 
time independent. 

Next, 

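f) the Heisenberg equations of motion follow immediately from

$$\begin{aligned}
\langle x(t) \rangle &= \mathrm{tr}\,\{x(0) P(t)\} \\
&= \mathrm{tr}\,\{x(0)\, U |\Psi(0)\rangle\langle\Psi(0)| U^\dagger\} \\
&= \mathrm{tr}\,\{U^\dagger x(0) U\, |\Psi(0)\rangle\langle\Psi(0)|\},
\end{aligned}$$

so

$$x(t) = U^\dagger x(0) U$$

and

$$\dot{x}(t) = i U^\dagger [H, x(0)]_{CB} U = i [H, x(t)]_{CB}.$$

This is the familiar Heisenberg matrix equation of motion involving the
commutator bracket of $x(t)$ with the generator of time translations $H$.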
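
The Heisenberg equation of motion in step f) can be checked numerically in the smallest nontrivial setting. The sketch below assumes a two-level system with a diagonal Hamiltonian $H = (A/2)\sigma_z$ and takes $x(0) = \sigma_x$ as a stand-in observable; both choices, the value of `A`, and the helper names are illustrative assumptions, with $\hbar = 1$ as in the text. It compares a finite-difference time derivative of $x(t) = U^\dagger x(0) U$ with $i[H, x(t)]$.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def combine(a, b, ca, cb):
    """Return ca*a + cb*b elementwise."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(2)] for i in range(2)]

def commutator(a, b):
    """Commutator bracket [a, b] = ab - ba."""
    return combine(matmul(a, b), matmul(b, a), 1.0, -1.0)

A = 1.5                                  # hypothetical level splitting
H = [[A / 2, 0.0], [0.0, -A / 2]]        # H = (A/2) sigma_z
X0 = [[0.0, 1.0], [1.0, 0.0]]            # x(0) = sigma_x (illustrative observable)

def U(t):
    """U(t) = exp(-i H t), trivial to write down because H is diagonal."""
    return [[cmath.exp(-0.5j * A * t), 0j],
            [0j, cmath.exp(+0.5j * A * t)]]

def x_of_t(t):
    """Heisenberg-picture operator x(t) = U^dagger x(0) U."""
    return matmul(dagger(U(t)), matmul(X0, U(t)))

t, h = 0.8, 1e-6
# Central finite difference for dx/dt.
numeric = combine(x_of_t(t + h), x_of_t(t - h), 1 / (2 * h), -1 / (2 * h))
# Right-hand side of the Heisenberg equation, i [H, x(t)].
analytic = [[1j * c for c in row] for row in commutator(H, x_of_t(t))]

max_error = max(abs(numeric[i][j] - analytic[i][j])
                for i in range(2) for j in range(2))
print("largest deviation between dx/dt and i[H, x(t)]:", max_error)
```
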

Proceeding ab initio from a Principle of Stationary Information Entropy
seems logically possible, but we forgo the exercise by identifying H with 
a non-trivial Hamiltonian and introducing for each generalized coordinate 
x the canonically conjugate operator p which diagonalizes the commutation 
relations; then we get the structure of Heisenberg matrix mechanics; and from 
that the Schroedinger derivative representation of the commutation relations, 
and wave mechanics; and finally the classical limit of commutator brackets 
as Poisson brackets. So we could also even 'discover' classical mechanics in 
this inverted logic. 

The Hamiltonian is paramount in this formulation, as it is in ordinary 
non-relativistic quantum mechanics. It remains to find the Lagrangian and 
the Action Principle, both of which seem somewhat contrived and ad hoc in 
the usual derivation of non-relativistic quantum mechanics. Helmholtz and
Gibbs [7] have shown that the stationary minimum of the free energy
A = E - TS + PV [8] fulfills the role of Lagrangian in reversible chemical reactions.
Extending the invariance group of the entropy to the Lorentz group will be 
useful also. 

What can be said when the probability $P$ is not simply factorizable? The
density matrix [9] must then be written as a sum over micro-ensembles
$\Psi_j$ each weighted with its own positive probability $P_j$. (Keep in mind
that the $P_j$ are now real numbers satisfying $P_j \geq 0$ and
$\sum_j P_j = 1$.)

$$\rho(t=0) = \sum_j \{ |\Psi_j\rangle P_j \langle\Psi_j| \}$$

in the usual notation. All the above equations are required to survive as

$$\begin{aligned}
\langle x(t) \rangle &= \mathrm{tr}\,\{U^\dagger x(0) U \rho(0)\} \\
\langle \dot{x}(t) \rangle &= i\,\mathrm{tr}\,\{U^\dagger [H, x(0)]_{CB} U \rho(0)\} \\
&= i\,\mathrm{tr}\,\{[H, x(t)]_{CB}\, \rho(0)\}.
\end{aligned}$$

The conditions for this survival are profound: we require a Hilbert space
of the $\Psi$'s and the corresponding operator or matrix representation of
the $U$'s. We introduced the notation somewhat gratuitously in the above
'pure'-case, but now the necessity of the full Hilbert space formalism is
clear. If we were to make any progress at all in our 'pretend world' in the
non-trivial 'mixed'-case we would have had to invent Hilbert space and
operator $U$'s at this point.



What is a Hilbert space? It is the complex vector space of the eigenvectors
$|\Psi_j\rangle$ of the hermitian operator $H$, so

$$H|\Psi_j\rangle = E_j|\Psi_j\rangle,$$

which have the quadratic norm

$$\langle\Psi_j|\Psi_k\rangle = \delta(j,k)$$

and so are orthogonal. They are complete

$$\sum_j |\Psi_j\rangle\langle\Psi_j| = 1,$$

where 1 must be 'interpreted'. We are confronted by the seemingly absurd, 
but in fact profound, Clintonesque question: What is 1? 

Two simplest examples are useful to keep in mind: 
a) The simplest is the two dimensional space of a spin-1/2 particle with

$$S_x = \sigma_x / 2, \text{ etc. for } y, z;$$
$$H = A S_z$$

with two eigenstates

$$|\Psi_1\rangle = \alpha = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad |\Psi_2\rangle = \beta = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

Orthonormalization and completeness are apparent. Complex representations
could equally well have been chosen, e.g., the eigenstates of $\sigma_x$ or
$\sigma_y$ instead of those for $\sigma_z$. The complex vector space in this
simplest case is spanned by two 2-dimensional orthonormal and complete
basis-vectors, such as $\alpha$ and $\beta$.

b) A less trivial example is the free particle in three unconstrained dimen- 
sions. The Hamiltonian is 

$$H = \frac{p^2}{2m}$$

and we choose simultaneous energy and momentum eigenstates

$$|\Psi_p(x)\rangle = e^{i p \cdot x}.$$

These have continuum orthonormalization

$$\int d^3x \, \Psi_k^*(x) \Psi_p(x) = (2\pi)^3 \delta^3(k - p),$$



and the very similar completeness relation

$$\int d^3p \, \Psi_p(x) \Psi_p^*(x') = (2\pi)^3 \delta^3(x - x').$$

In this case, the complex vector space is spanned by a mind-boggling array
of basis-vectors: there is a triple-infinity of orthonormal and complete basis- 
vectors labeled by p, each with a triple-infinity of components labeled by x 
(or vice versa). 
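
For the spin-1/2 example a), orthonormality and completeness can be verified directly, both for the real basis $\alpha$, $\beta$ and for a genuinely complex one. The sketch below uses the standard eigenvectors of $\sigma_y$, $(1, \pm i)/\sqrt{2}$, as the complex basis; the helper names are illustrative.

```python
import math

def inner(u, v):
    """<u|v>, with complex conjugation on the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def outer(u, v):
    """|u><v| as a nested-list matrix."""
    return [[a * b.conjugate() for b in v] for a in u]

def resolution_of_unity(basis):
    """Sum of the projectors |psi><psi| over all basis vectors."""
    n = len(basis)
    total = [[0j] * n for _ in range(n)]
    for psi in basis:
        proj = outer(psi, psi)
        for i in range(n):
            for j in range(n):
                total[i][j] += proj[i][j]
    return total

r = 1 / math.sqrt(2)
sz_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]            # alpha, beta
sy_basis = [[r + 0j, 1j * r], [r + 0j, -1j * r]]   # eigenvectors of sigma_y

for name, basis in (("sigma_z", sz_basis), ("sigma_y", sy_basis)):
    print(name, "overlap <1|2> =", inner(basis[0], basis[1]))
    print(name, "sum of projectors =", resolution_of_unity(basis))
```

Either basis sums to the 2x2 unit matrix, which is one concrete answer to the question of what "1" means in this smallest case.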

We conclude that a Principle of Stationary Information Entropy, cou- 
pled with some inspired (but a posteriori inevitable) requirements during 
the analysis, can completely reverse the logical structure of quantum me- 
chanics. With this approach, there is no 'interpretation' of quantum me- 
chanics; nor need there be any hesitation or delay in this 'pretend world' 
of making all the usual applications. A Hilbert space is required. Heisen- 
berg matrix-commutator mechanics is required. We can imagine that the 
concept of generalized coordinates and their conjugate momenta would nat- 
urally occur, even without classical mechanics, as an optimal minimum set of 
non-commuting variables defined to diagonalize the commutation relations 
in a standard way. Schroedinger's differential representation would be next 
and with it, all of wave mechanics and its intuitive (but occasionally mislead- 
ing) guidance. And - as now - we can imagine a compelling route even to 
classical physics, but with a deeper understanding of the Principle of Least 
Action. 

3 Adding Structure 

Now we have to ask: What is $\Psi$? and what is $\sum_j$? In fact, what is $j$?

Our answers to these questions must satisfy the requirement that "$\Psi$ is
the complex resolution of unity" so we have completeness

$$1 = \sum_j |\Psi_j\rangle\langle\Psi_j|$$

and orthogonality

$$\langle\Psi_j|\Psi_k\rangle = \delta(j,k).$$

The density matrix is constructed as

$$\rho = \sum_j |\Psi_j\rangle P_j \langle\Psi_j|$$

with $P_j \geq 0$ and $\sum_j P_j = 1$. Our target structure will be ordinary quantum
mechanics which will make its appearance in the old-fashioned but explicit 
and intuitive Fock-representation. 

The simplest case has dimension D = 2 and $\Psi_j$'s which are two states
specified by the elementary Pauli spinors. It is a simple but instructive 
example and proves to be a faithful guide to any level of complexity. We 
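have the standard representation

$$|\Psi_1\rangle = \alpha = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad |\Psi_2\rangle = \beta = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

The two dimensional (real-)resolution of unity is

$$1 = |\Psi_1\rangle\langle\Psi_1| + |\Psi_2\rangle\langle\Psi_2| = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

and the density matrix is

$$\rho = |\Psi_1\rangle P_1 \langle\Psi_1| + |\Psi_2\rangle P_2 \langle\Psi_2| = \begin{pmatrix} P_1 & 0 \\ 0 & P_2 \end{pmatrix}.$$

The unitary time evolution of these states which leaves $\rho$ unchanged is
generated by the no-interaction Hamiltonian

$$H = A S_z = \frac{A}{2} \sigma_z = \frac{A}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

An interaction between the two states would involve $\sigma_x$ and/or $\sigma_y$ and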
would induce continuous transformations between the two states, which were 
chosen to diagonalize the initial density matrix. Hardy [1,2] identifies the 
existence of these continuous transformations between basis states as the 
essential distinction between classical probability theory and quantum me- 
chanics. It is seen to arise here from the extension of the 'resolutions of 
unity' - suggested by the requirement that the sought-for conjectured
classical dynamical variables ($\sim \sqrt{P}$) satisfy only holonomic constraints - to the
'complex resolutions of unity'. These must be included for the sake of logical 
completeness. 
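
The contrast between the no-interaction Hamiltonian and an interaction involving $\sigma_x$ can be made concrete with a minimal sketch. It assumes a diagonal initial density matrix with illustrative weights $P_1 = 0.9$, $P_2 = 0.1$, a hypothetical coupling strength `g` for an interaction $g\sigma_x$, and $\hbar = 1$; none of these values come from the text.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def evolve(rho, U):
    """Return U rho U^dagger."""
    return matmul(U, matmul(rho, dagger(U)))

def u_free(A, t):
    """exp(-i H t) for the diagonal Hamiltonian H = (A/2) sigma_z."""
    return [[cmath.exp(-0.5j * A * t), 0j],
            [0j, cmath.exp(+0.5j * A * t)]]

def u_interaction(g, t):
    """exp(-i g t sigma_x) = cos(g t) - i sin(g t) sigma_x."""
    c, s = math.cos(g * t), math.sin(g * t)
    return [[c, -1j * s], [-1j * s, c]]

P1, P2 = 0.9, 0.1                       # illustrative weights
rho0 = [[P1, 0.0], [0.0, P2]]
rho_free = evolve(rho0, u_free(A=2.0, t=0.5))
rho_int = evolve(rho0, u_interaction(g=1.0, t=0.5))

print("free evolution, populations: ", rho_free[0][0].real, rho_free[1][1].real)
print("with sigma_x, populations:   ", rho_int[0][0].real, rho_int[1][1].real)
print("trace with sigma_x:          ", (rho_int[0][0] + rho_int[1][1]).real)
```

Under the diagonal Hamiltonian the populations stay at their initial values, while the $\sigma_x$ term moves weight continuously between the two basis states with the trace held at 1.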

We can extend this simplest example in a multitude of directions. 

a) The first extension is almost trivial: generalize the 2-D SU(2) example
above to other similar groups including the familiar O(3) and SU(3).

b) A second extension is to the interesting dynamical problems which arise 
when the density matrix of two (or more) a priori independent systems is 
defined in the direct product space of the two systems as 

$$\rho(1,2) = \rho(1) \otimes \rho(2).$$

This requires basis representations

$$\Psi(1,2) = \Psi(1) \otimes \Psi(2)$$

and the Hamiltonian

$$H(1,2) = H(1) + H(2) + H_{int}(1,2)$$

which dynamically couples the two systems. Such dual - but isolated - 
systems still fall short of a model for the full measurement process [10]. We 
should not be surprised or disappointed. Standard quantum mechanics is 
our limited goal. 
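
The direct-product construction of extension b) can be sketched for two independent two-state systems. The example assumes both density matrices are diagonal, so the entropy can be read off the diagonal entries; the weights and helper names are illustrative. It checks that the product density matrix has unit trace and that the entropies of a priori independent systems add.

```python
import math

def kron(a, b):
    """Kronecker (direct) product of two square nested-list matrices."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def entropy_of_diagonal(rho):
    """S = -tr(rho ln rho) for a density matrix that is already diagonal."""
    return -sum(rho[i][i] * math.log(rho[i][i])
                for i in range(len(rho)) if rho[i][i] > 0.0)

rho1 = [[0.7, 0.0], [0.0, 0.3]]   # illustrative weights for system 1
rho2 = [[0.5, 0.0], [0.0, 0.5]]   # illustrative weights for system 2
rho12 = kron(rho1, rho2)          # rho(1,2) = rho(1) (x) rho(2), a 4x4 matrix

trace12 = sum(rho12[i][i] for i in range(4))
S1 = entropy_of_diagonal(rho1)
S2 = entropy_of_diagonal(rho2)
S12 = entropy_of_diagonal(rho12)

print("trace of rho(1,2):", trace12)
print("S(1) + S(2) =", S1 + S2, "  S(1,2) =", S12)
```

Additivity holds here only because the two systems are uncorrelated; an interaction term $H_{int}(1,2)$ would in general build correlations that this product form no longer captures.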

c) A third extension, required for continuous variables like $x$ and $p$, is
somewhat different but is directly suggested by the SU(2) example: it
involves judicious replacements of sums by integrals and Kronecker $\delta$
by Dirac $\delta^3(p - p')$. For example, the 'complex resolution of unity'
for a probability distribution defined in momentum space now becomes

$$1 \Rightarrow (2\pi)^3 \delta^3(p - k) = \int d^3x \, e^{i(p-k)\cdot x}.$$



One might well ask: Where did the plane wave solutions come from in this 
'pretend world' ? Of course, we have the answer from quantum mechanics as 
we have been taught it; but how would we arrive at this result starting with 
a stationary probability, factored into 'complex resolutions of unity'? 

We must start with a dimensionality chosen for the problem at hand. 
This is where physics, judgement, and discovery enter. The next step is to
require the time evolution operator to be unitary, and the generator to be a 
hermitian Hamiltonian operator whose commutator with x is the velocity v. 
We can follow two paths here: The easiest one is to take classical mechanics 
as already known, and simply use the Hamiltonian $H = p^2/2m$. Then the
commutation relation requires $p = \hbar\nabla/i$, and the choice of plane waves
as 'resolvent' functions is dictated by the resulting Hilbert space. Again 
there are judgements to be made, and we choose a rectangular basis which 
diagonalizes the linear momenta. This choice replaces all operators by their 
eigenvalues. 

A second path suggests itself, as mentioned above, of deducing classical 
mechanics ab initio from the factorization requirement, and diagonalizing the 
commutation relations. In this way, we are led to 'discover' the generalized
momentum $p_x$ conjugate to the generalized coordinate $x$. We are limited in
this way to a unit metric in generalizing sums to integrals, but we do get 
a toehold and could subsequently change to a basis of angular momentum 
eigenstates for example. 

Traditional quantum calculations are quite conveniently made in this den- 
sity function representation. We give a brief heuristic sketch of transitions 
between basis states induced by a non-diagonal interaction Hamiltonian H'. 
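The initial density matrix evolves in time to

$$\begin{aligned}
\rho(0) &= |\Psi_j\rangle\langle\Psi_j| \\
\to \rho(t) &= e^{-iH't} \rho(0) e^{+iH't} \\
&\to \sum_m |\Psi_m\rangle P_m(t) \langle\Psi_m|.
\end{aligned}$$

The probability $P_k(t)$ at time $t$ of a state $k \neq j$ is

$$\begin{aligned}
P_k(t) &= \mathrm{tr}\, |\Psi_k\rangle\langle\Psi_k| \rho(t) \\
&= |\langle\Psi_k| e^{-iH't} |\Psi_j\rangle|^2 \\
&\approx t^2 |\langle\Psi_k| H' |\Psi_j\rangle|^2,
\end{aligned}$$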

proportional in first order to the absolute square of the matrix element of 
the perturbing interaction Hamiltonian. The factor t 2 requires some care 
in application to realistic energy conserving transitions but we won't pursue 
that. We should perhaps better recover the Schroedinger equation directly 
to evaluate 

in terms of the energy eigenstates of the full Hamiltonian. In either case, we 
return directly to the usual quantum results. 
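
The short-time estimate for $P_k(t)$ can be compared with an exact result in a two-level sketch. It assumes a purely off-diagonal interaction $H' = g\sigma_x$ with a hypothetical coupling `g` and $\hbar = 1$, for which $e^{-iH't} = \cos(gt) - i\sin(gt)\,\sigma_x$ and therefore $P_k(t) = \sin^2(gt)$ for $k \neq j$.

```python
import math

def transition_probability(g, t):
    """Exact P_k(t) = |<k| exp(-i H' t) |j>|^2 for H' = g * sigma_x, k != j."""
    amplitude = -1j * math.sin(g * t)   # off-diagonal element of exp(-i g t sigma_x)
    return abs(amplitude) ** 2

def lowest_order(g, t):
    """Lowest-order estimate t^2 |<k|H'|j>|^2, with <k|H'|j> = g."""
    return (t * abs(g)) ** 2

g = 0.3                                 # hypothetical coupling strength
for t in (0.1, 1.0, 5.0):
    exact = transition_probability(g, t)
    approx = lowest_order(g, t)
    print(f"t = {t:4.1f}   exact = {exact:.6f}   t^2 |H'_kj|^2 = {approx:.6f}")
```

At small `g * t` the two expressions agree closely; at larger times the $t^2$ estimate grows without bound while the exact probability stays at or below 1, which is one reason the factor $t^2$ needs care in realistic applications.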

4 Concluding Remarks 

Let us summarize and reiterate what has been done. We start with the 
information entropy of a macro-ensemble

$$S = -\sum_j \{P_j \ln P_j\} \geq 0,$$

where j enumerates the micro-ensembles occurring in the macro-ensemble 
with probability $P_j$. $P_j$ must satisfy the obvious restrictions

$$0 \leq P_j \leq 1 \quad \text{and} \quad \sum_j P_j = 1.$$

We then embark on a 'pretend'-journey of 'discovery' resulting in quantum 
mechanics. 

Our goal is to find a dynamical theory governing the system. We rule 
out the probabilities Pj themselves as candidate fundamental variables of 
such a dynamical theory. The reason is familiar from classical mechanics: 
the inequality $P_j \geq 0$ is a non-holonomic constraint and "there is no general
way" of attacking such problems. This leads us to factorize the probabilities 
ultimately to

$$P_j \to \Psi_j^2 \quad \text{and further to} \quad \to \Psi_j \Psi_j^*.$$

a) Why factorization? To satisfy the non-holonomic constraint identically.

b) Why complex factors? This generalization is obviously permissible and 
thus a logical necessity. 

c) What factors? The answer to this question builds upon experience with 
the simplest system imaginable: the 2-dimensional complex space of
spin-1/2. Finally we can conclude that the $\Psi$'s are a complete orthonormal set
of complex basis vectors in the Hilbert space generated by the Hamiltonian 
describing the dynamics of the system under consideration. 

d) What dynamics? To preserve the entropy of a macro-ensemble, we require 
the time dependence of the $\Psi_j$ to be a unitary transformation generated by
the hermitian time evolution operator (i.e., the Hamiltonian) of the system. 
The actual choice of the Hamiltonian is not made by the quantum theory per 
se but becomes an act of creative judgement subject only to the achievement 
of interesting results. 

e) Elementary considerations lead directly to the Heisenberg equations of 
motion involving the commutator with the Hamiltonian. 

f) The $\Psi_j(t)$ satisfy the Schroedinger equation and in the usual way
constitute the required Hilbert space vectors.

We could continue in this way, or we could return to ordinary quantum 
mechanics with the - perhaps not so new - understanding of the wave func- 
tions as 'complex resolutions of unity' - i.e., projection operators - onto each 
particular micro-ensemble in the macro-ensemble under consideration. 

What has been gained is not a new quantum mechanics, but a reason for 
the existence of the old one. The existence of quantum mechanics is neces- 
sary in order for there to be a fundamental dynamical theory governing the 
elementary probabilities in the Shannon Information Entropy. In addition, 
this derivation justifies the subjective interpretation of the wave function 
as the encoding of information. Even further, such fundamental entities as 
the Heisenberg matrix equations of motion appear naturally, suggesting the 
fundamental commutation relations diagonalized by the introduction of the 
momentum canonically conjugate to each independent coordinate; and even 
the derivation of classical mechanics from this quantum mechanics, and an 
alternative to the Principle of Least Action; all from a Principle of Stationary 
Entropy. 

As advertised, our development is intended to be a 'pretend' voyage of
discovery, not a mathematical derivation of quantum mechanics. Caticha 
[3] points out the many logical shortcomings in our too-facile assumptions, 
some of which we describe below. First, however, let us point out that 
the possibilities employed in our scenario do exist as a path through the 
maze. Our specific choice of complex numbers, Hilbert spaces, and unitary 
transformations turns out to be sufficient and self-consistent but not proven
to be necessary. Caticha [3] points out the possibility of Clifford algebras [11]
of real vectors inter alia. These do have a possible presence in extensions of 
quantum mechanics to include Weyl spinors and Grassmann variables, but 
the onus so far has been on the conformability of these structures with the 
pre-existing structure of quantum mechanics. 

Our philosophy is a pragmatic positivist one. In this view, every exception 
is to be viewed not as a barrier to progress, but as an opportunity. It is 
an appeal to a Correspondence Principle. At the same time, we have to 
acknowledge the possibility that some arcana really are mysterious.